Joint processing of audio and visual information for multimedia indexing and human-computer interaction

Authors

Chalapathy Neti

Benoît Maison

Andrew W. Senior

Giridharan Iyengar

P. Decuetos

Sankar Basu

Ashish Verma

Abstract

Information fusion in the context of combining multiple streams of data, e.g., audio streams and video streams corresponding to the same perceptual process, is considered in a somewhat generalized setting. Specifically, we consider the problem of combining visual cues with audio signals for the purpose of improved automatic machine recognition of descriptors, e.g., speech recognition/transcription, speaker change detection, speaker identification and speaker event detection. These happen to be important descriptors for multimedia content (video) for efficient search and retrieval. A general framework for considering all of these fusion problems in a unified setting is considered.
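
The abstract does not say how the two streams are combined. Purely as an illustration of what fusing audio and visual evidence for a descriptor such as speaker identification can look like, here is a minimal sketch of decision-level (late) fusion. It assumes each stream has already produced a log-likelihood per candidate class. The function names, the `audio_weight` parameter, and the scores are hypothetical and are not taken from the paper.

```python
import math


def fuse_log_likelihoods(audio_ll, visual_ll, audio_weight=0.7):
    """Combine per-class log-likelihoods from two streams by a weighted sum.

    audio_weight is a hypothetical reliability weight in [0, 1]; the visual
    stream receives the remaining weight (1 - audio_weight).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_ll.keys() != visual_ll.keys():
        raise ValueError("both streams must score the same classes")
    return {
        label: audio_weight * audio_ll[label]
        + (1.0 - audio_weight) * visual_ll[label]
        for label in audio_ll
    }


def decide(fused):
    """Return the class label with the highest fused score."""
    return max(fused, key=fused.get)


# Made-up scores: the audio stream (e.g. degraded by noise) slightly prefers
# speaker_b, while the visual stream strongly prefers speaker_a.
audio_ll = {"speaker_a": math.log(0.40), "speaker_b": math.log(0.60)}
visual_ll = {"speaker_a": math.log(0.90), "speaker_b": math.log(0.10)}

fused = fuse_log_likelihoods(audio_ll, visual_ll, audio_weight=0.5)
best = decide(fused)
```

In this sketch, lowering `audio_weight` lets the visual stream dominate when the audio is unreliable. Setting it to 1.0 reduces the decision to audio alone.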

Similar resources

Perceptual interfaces for information interaction: joint processing of audio and visual information for human-computer interaction

We are exploiting the human perceptual principle of sensory integration (the joint use of audio and visual information) to improve the recognition of human activity (speech recognition, speech event detection and speaker change), intent (intent to speak) and human identity (speaker recognition), particularly in the presence of acoustic degradation due to noise and channel. In this paper, we pre...

Audio-visual interaction in multimedia communication

To many people, the word “multimedia” simply means the combination of various forms of information: text, speech, music, images, graphics and video. What is often overlooked is the interaction among these forms. In this paper, we will present our recent results in exploiting the audio-visual interaction that is very significant in multimedia communication. The applications include lip synchroni...

Look Who's Talking: Speaker Detection using Video and Audio Correlation

The visual motion of the mouth and the corresponding audio data generated when a person speaks are highly correlated. This fact has been exploited for lip/speechreading and for improving speech recognition. We describe a method of automatically detecting a talking person (both spatially and temporally) using video and audio data from a single microphone. The audio-visual correlation is learned ...

The effects of segmentation and redundancy methods on cognitive load and vocabulary learning and comprehension of English lessons in a multimedia learning environment

The present study was conducted to examine the effects of segmentation and redundancy methods on cognitive load and on vocabulary learning and comprehension of English lessons in a multimedia learning environment. In terms of purpose, this is applied research, and in terms of method it is a true experimental study. The statistical population of the present study includes all people aged 14 to 16 who are enrolled in ...

Eye-Tracking Method’ Usage for Understanding the Cognitive Processes in Multimedia Learning

Introduction: The design of multimedia learning environments should be grounded in evidence-based studies and principles of the human learning process. Eye tracking is a method for observing how learners process the learning materials presented in multimedia learning environments. The aim of the study was to examine the use of the eye-tracking method to investigate the cognitive processes in m...

Publication date: 2000
